Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove duplicate coordinates with indexes on sub-trees #9531

Merged
merged 3 commits into from
Sep 26, 2024

Conversation

shoyer
Copy link
Member

@shoyer shoyer commented Sep 22, 2024

This is a partial solution to the duplicate coordinates issue from #9475.

Here we remove all duplicate coordinates between parent and child nodes with an index (these are already checked for equality via the alignment check).

Other repeated coordinates (which we cannot automatically check for equality) are still allowed for now. We will need an alternative solution for these, as discussed in #9475, but it is less obvious what the right solution is so I'm holding off on it for now.

  • Tests added

This is a _partial_ solution to the duplicate coordinates issue from pydata#9475.

Here we remove all duplicate coordinates between parent and child nodes
with an index (these are already checked for equality via the alignment
check).

Other repeated coordinates (which we cannot automatically check for
equality) are still allowed for now. We will need an alternative
solution for these, as discussed in pydata#9475, but it is less obvious
what the right solution is so I'm holding off on it for now.
@shoyer shoyer marked this pull request as ready for review September 22, 2024 20:29
@TomNicholas TomNicholas added the topic-DataTree Related to the implementation of a DataTree class label Sep 22, 2024
Copy link
Contributor

@TomNicholas TomNicholas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this completely supercede #9510? If so how exactly does it get around the problem mentioned in #9510 (comment)?

This PR looks good, but I'm not sure we should merge it until we know what the rest of the solution to #9475 looks like.

xarray/core/datatree.py Outdated Show resolved Hide resolved
xarray/tests/test_datatree.py Show resolved Hide resolved
xarray/core/datatree.py Outdated Show resolved Hide resolved
@shoyer
Copy link
Member Author

shoyer commented Sep 22, 2024

Does this completely supercede #9510? If so how exactly does it get around the problem mentioned in #9510 (comment)?

This PR looks good, but I'm not sure we should merge it until we know what the rest of the solution to #9475 looks like.

This PR intentionally punts on the issue of duplicate inherited non-index coordinates for now, preserving the current behavior (keeping them duplicated). I think it is an incremental improvement, and any full solution in #9475 will include this.

The behavior of duplicate non-indexed coordinates will also need another fix. In my mind, there are two leading contenders, but I would like to discuss it a bit more:

@shoyer
Copy link
Member Author

shoyer commented Sep 26, 2024

@TomNicholas are you comfortable merging this PR for now? I think it will be a part of whichever final solution we pick

@shoyer shoyer merged commit fecaa85 into pydata:main Sep 26, 2024
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-DataTree Related to the implementation of a DataTree class
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants